I. INTRODUCTION

A. AUTOMATION OF A SUPPLY SYSTEM

Out of a need for a data automation system which would
handle stock points and inventory control points grew a
need for a plan to bring together a number of information
systems centered around stock point and inventory control
point applications. The Stock Point Logistics Integrated
Communications Environment (SPLICE) concept is that plan.
This concept involves the distribution of a number of local
area networks (LANs) which communicate via the Defense Data
Network (DDN). This thesis will take a brief look at some
aspects of the plan. But before examining the SPLICE
concept, we will compare Local Area Networks and Long
Distance Networks.


B. LOCAL NETWORKS

A local area network is a data communications system
which allows a number of independent devices to communicate
with each other, including computers, terminals, mass storage
devices, printers, plotters and copying machines. A local
network supports a wide variety of applications such as file
editing and transfer, graphics, word processing, electronic
mail, database management and digital voice [Ref. 1]. Each
LAN in SPLICE will be uniquely configured and may include
some or all of the above components. The question to ask
concerning local area networks is "What are the characteristics
which make up a local area network?"

According to A. S. Tanenbaum, reference 2, local
networks have three distinct characteristics:

1. A diameter of not more than a few kilometers

2. A total data rate exceeding 1 Mbps

3. Ownership by a single organization

Long distance networks, on the other hand, are networks like
DDN. A long distance network is usually owned by a communications
carrier and is operated as a public utility for its
subscribers, providing services such as voice, data and
video [Ref. 1]. The way a communications system will allow
effective message exchange between different communities of
users within each local area network is beyond the scope of
this thesis.

C. SPLICE AND ITS RELATIONSHIP WITH UADPS-SP

When SPLICE is examined, we see that it is designed to
augment the existing Navy stock point and inventory control
point ADP facilities which support the Uniform Automated
Data Processing System-Stock Points (UADPS-SP) [Ref. 3].
This system was one of the first attempts at standardizing
distributed logistical information. The evolution of
UADPS-SP will be traced in the following sections.

1. Origin of UADPS-SP

The original concept of the distributed processing
of supply transactions, along with the maintenance of stock
records, was first tested at NSC Norfolk in 1956. Upon the
successful completion of tests, a number of computers of
various sizes and models were installed at a few NSCs
(Oakland, Bayonne, San Diego), at NSD Newport and at NSY
Charleston in 1957 and 1958. Prompted by a push for standardization
of DOD logistics management systems, in February
of 1961, the Bureau of Supplies and Accounts (presently
NAVSUP) established a full-time committee to standardize
procedures as well as equipment at major stock points. The
IBM 1410 computer was selected for Stock Point UADPS in
January of 1962. Upon completion of an ADP programming
training course, each participating stock point was assigned
the task of analyzing and programming a particular application.
Figure 1.1 lists the initial applications along with


                  HISTORY OF STOCK POINT UADPS

Application  Title                               Activity

A            Requisition History and Status      NSC Bayonne
B            Receipts/Dues                       NSD Newport
C            Demand Processing                   NSC Norfolk
D            Inventory Control File Maintenance  NSC Oakland
E            Financial Inventory Control         NSCs San Diego and Oakland
F            Stores Accounting                   NSC Pearl Harbor
G            Cost Accounting                     NSC San Diego
K            Payroll                             NSC Pearl Harbor (later
                                                 changed to NSY Long Beach)

Figure 1.1  Initial Applications.

the activity they were assigned to in 1962. To this list, a
number of other applications have been added to date. Of
course, this was only the beginning of a system which has
grown and evolved over the years. There have been numerous
modifications and alterations to the hardware and software
of this system. There have also been a number of subsystems
added to it. The activities which have implemented baseline
UADPS-SP application segments or significant subsystems are
listed in figure 1.2.

Today, the hardware for the UADPS-SP consists of the
Burroughs medium sized (B-3500/3700/4700/4800) systems.
Presently, there are twenty new application systems being
developed which require considerable interactive and telecommunication
support. The current UADPS-SP cannot support
these requirements without a total redesign effort and will
probably require future replacement of the current mainframes
[Ref. 3]. Nevertheless, as the Navy Supply System
evolves to meet the changing fleet needs, FMSO will adapt
and adjust UADPS-SP to meet these emerging requirements
[Ref. 4].

2. SPLICE, a computer network in support of UADPS-SP

Returning to the SPLICE concept, it has been decided
that the Burroughs computers will provide background
processing functions for large file processing and report
generation [Ref. 3]. These are the same computers used in
the UADPS-SP system.

According to reference 3,

    SPLICE will be developed, however, using a standard set
    of minicomputer hardware and software. This standardization
    is particularly important because SPLICE will be
    implemented at some sixty different geographical locations,
    each having a different mix of application and
    terminal requirements. Additionally, each LAN must have
    the capability of communicating with other LANs via the
    Defense Data Network (DDN), which is to be provided by
    the Defense Communications Agency (DCA).

A layout of the local area network (LAN) can be
seen in figure 1.3. Figure 1.4, on the other hand, shows
the logical network concept. Each local area network will
include the following software modules:

1. Local Communications (LC)

2. National Communications (NC)

3. Front-End Processing (FEP)

4. Terminal Management (TM)

5. Data Base Management (DBM)

6. Session Services (SS)

7. Peripheral Management (PM)

8. Resource Allocation (RA)

The above modules will be divided into those modules which
perform operating functions and those which support the
effective use of these modules on the local area network.

D. STANDARDIZATION OF SPLICE BY DOD

One of the objectives of DOD in SPLICE has been that of
standardization. Independent development of local area
networks would cause problems. The major problem would be
unnecessary duplication of effort and continued production
of unique hardware and software. A standard system, on the
other hand, would be more economical to design, develop,
maintain and operate. For a project the size of SPLICE,
standardization is the only wise choice.

E. FUNCTIONS OF THE DATABASE MANAGEMENT MODULE

As a result of ongoing research in the implementation of
SPLICE, this thesis will address those issues involved with
the design of the Data Base Management Module of the local
area networks. The functions performed by the Database
Management Module as outlined in reference 3 will be the
following:

1. File creation

2. File update

3. Query processing and data retrieval

4. Data dictionary creation and maintenance

5. File catalog creation and maintenance

Figure 1.5 gives an outline of a back-end database
management system taken from [Ref. 3]. The implementation
of SPLICE for the Database Management Module has been
discussed in reference 3 and the conceptual employment of it
according to reference 5 is:

    "The concept employed in the recommended
    implementation of the database and Terminal
    Management Resource requirements for SPLICE
    center around a highly decentralized and
    loosely coupled distributed local area
    network (LAN)."

The processors for each software module within each LAN will
be implemented separately.

F. SCOPE OF THESIS

In this thesis, a proposed conceptual design of the
database for SPLICE will be presented. It will also address
possible problem areas which may be encountered when this
module is finally in place and suggestions on how these
problems may be resolved. Finally, some hardware suggestions
will be made for the future.
HISTORY OF STOCK POINT UADPS


The  following  activities  have  implemented  baseline  UADPS-SP  or  significant 
subsystem/application  segments: 


UADPS-SP  ACTIVITIES 


MCAS  Cherry  Point 

MCAS El Toro

MCAS  Yuma 

NAF  Washington 

NAS  Alameda 

NAS  Atlanta 

NAS  Barbers  Point 

NAS  Cecil  Field 

NAS  Corpus  Christi 

NAS  Glenview 

NAS  Jacksonville 

NAS  Lemoore 

NAS  Memphis 

NAS  Miramar 

NAS  Moffett  Field 

NAS  New  Orleans 

NAS  Norfolk 

NAS  North  Island 

NAS  Patuxent  River 

NAS  Pensacola 

NAS  Point  Mugu 

NAS  South  Weymouth 

NAS  Whidbey  Island 

NAS  Willow  Grove 

NSC  Bayonne  (Disestablished) 


NSC  Charleston 

NSC  Long  Beach  (Disestablished) 

NSC  Norfolk 

NSC  Oakland 

NSC  Pearl  Harbor 

NSC  Puget  Sound 

NSC  San  Diego 

NSD  Guam 

NSD  Newport  (Disestablished) 

NSD  Subic  Bay 

NSY  Norfolk 

NSY  Philadelphia 

NSY  Portsmouth 

NAVSUBASE  Bangor 

NAVSUBASE  New  London 

NAVSUBASE  Pearl  Harbor 

ASO  Philadelphia 

NPFC  Philadelphia 

NARDAF  Newport 

NAVMTO  Norfolk 

NAVRESUPPOFC  New  Orleans 

PMOLANT  Charleston 

PMOPAC  Bremerton 

SPCC  Mechanicsburg 

SWFPAC Silverdale WA


Figure 1.2  Baseline UADPS-SP Application Segments.

Figure 1.3  Local Area Network Layout.

Figure 1.4  Logical Network Concept.

Figure 1.5  Outline of a Back-end Database Management System.
II. THE CONCEPTUAL DATABASE SCHEME

A. DATABASES PRESENTLY A PART OF SPLICE

The present databases of projects under the umbrella of
the SPLICE project vary greatly. None of the databases are
standard. It is one of the purposes of this thesis to
propose a new conceptual design of a database in the SPLICE
project which will help standardize database operations for
all sites involved with this project. Since each LAN will
have a database management module, standardizing databases
will allow users to query databases more easily from remote
sites. All queries could be standardized as well. This
chapter will discuss the conceptual design of such a database.

B. DEFINITION OF WHAT A CONCEPTUAL VIEW ENTAILS

The conceptual view is a representation of the entire
information content of the database, in a form that is
somewhat abstract in comparison with the way in which the
data is physically stored...the conceptual view consists of
multiple occurrences of multiple types of a conceptual
record...the conceptual view is defined by means of the
conceptual schema, which includes definitions of each of the
various types of conceptual record [Ref. 6]. This means
that the conceptual view of a database shows the overall
content of the database. The conceptual schema defines that
view.
1. Definition of SPLICE database

The database with which we are concerned is a database
which contains information about parts. These parts are
parts for ships, airplanes, etc. Therefore we can assume
that basically this database will be a system which inventories
parts. In a database of this type certain information
is important:

1. Stock number or Manufacturer's number

2. Name of the manufacturer, if it applies
   to this item

3. The cost of each item

4. The quantity of items available

5. The location of the item (the Activity)

6. A brief description of that item.

This is the minimum amount of information which is required
for an inventory system.

2. Approaches used to Represent a Database

The next thing to decide is the kind of approach to
be used to represent the data in the database. The best
known approaches are relational, hierarchical, and network.
The approach proposed in this thesis will be the relational
approach. The relational approach to data is based on the
realization that files that obey certain constraints may be
considered as mathematical relations, and hence that elementary
theory about relations may be brought to bear on
various practical problems of dealing with data in such
files [Ref. 7]. Notice the relations given in figure 2.1.
These table-like structures are called relations. The rows
of such tables or relations are called "tuples" and the
columns are usually called "attributes". One concept that
relational theory emphasizes and for which there does not
seem to be an established data processing term, is the
concept of the domain. A domain is a pool of values from
which the actual values appearing in a given column are
drawn [Ref. 7].

3. Proposed Conceptual Database Design

In figure 2.1 notice the relations of which the database
is composed. There are four relations to be considered. The
first of these is the Stock-Part relation which includes:

1. Stock Number (Stock-Num) - This number
   is the Federal Stock number, normally a
   thirteen digit number, which
   is assigned to all stock parts. The
   stock numbers could be listed in a
   user's manual which could be placed on
   secondary storage or online. When the stock
   number list is updated, an updated version
   of the user's manual could be printed.
   (Note: The format of the Federal Stock
   Number is given below:
     1. Digits 1 - 4    Federal Supply Classification
     2. Digits 5 - 6    National Coding Bureau Number
     3. Digits 7 - 13   National Item ID Number
   Additionally, digits 14 - 15 are
   used for Weapon Systems and Aviation
   parts.)

2. Manufacturer Number (Mf-Num) - This is
   assigned to the part by the manufacturer.
   Since there is no consistent way of
   numbering parts by manufacturers, it would
   be best if we did not use their numbering
   scheme to inventory parts. There could
   also be a duplication of manufacturer's
   numbers because of the inconsistencies
   caused by the lack of standards used by
   manufacturers. The use of Federal Stock
   numbers would eliminate this problem.

3. Manufacturer's Name (Mf-Name) - The
   manufacturer's name is given. Some portion
   of it could be abbreviated.

4. Part Name (Part-Name) - This gives the
   general category of a part, e.g. rudder.

5. Quantity (Quantity) - This is the number
   of parts on hand at that particular time.

6. Cost (Cost) - This is the cost of each item.

7. Details (Details) - This will give more
   details than the part name. Information
   such as the dimensions of the part (size,
   length, etc.) is given. Differences
   in dimensions will cause the stock number
   to change, e.g. a 1/2 inch screw has a
   different stock number than a 1/4 inch
   screw.

8. Reorder Point (Reord-Pt) - This is the
   point at which the inventory is replenished
   for this part. When quantity gets below this
   point the Vendor's list must be consulted
   to reorder stock parts (true for Government
   equipment).

9. Weight (Wt) - This is the attribute which
   gives the weight of the part in terms of
   pounds, i.e. pounds/part.

10. Total Weight (Tot-Wt) - This attribute
    gives the total weight of all parts with
    this stock number.

Note: The key is Stock-Num.
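
The stock number layout given in the note under attribute 1 can be illustrated with a short sketch. It is written in Python purely for illustration; the names FederalStockNumber and parse_stock_number are hypothetical, and the decision to accept hyphens or spaces between fields is an assumption, not something the thesis specifies.

```python
from typing import NamedTuple, Optional


class FederalStockNumber(NamedTuple):
    supply_class: str       # digits 1-4: Federal Supply Classification
    coding_bureau: str      # digits 5-6: National Coding Bureau Number
    item_id: str            # digits 7-13: National Item ID Number
    suffix: Optional[str]   # digits 14-15: weapon systems / aviation parts only


def parse_stock_number(raw: str) -> FederalStockNumber:
    """Split a 13- or 15-digit stock number into the fields listed above."""
    # Assumption: separators such as hyphens or spaces may appear between fields.
    digits = raw.replace("-", "").replace(" ", "")
    if not (digits.isascii() and digits.isdigit()) or len(digits) not in (13, 15):
        raise ValueError(f"expected 13 or 15 digits, got {raw!r}")
    suffix = digits[13:15] if len(digits) == 15 else None
    return FederalStockNumber(digits[0:4], digits[4:6], digits[6:13], suffix)
```

A check of this kind could be applied before a tuple is stored, so that a malformed Stock-Num never becomes a key value.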
The second relation is Local-Net. This relation provides
fast access to information indicating the sites where stock
parts may be found. The attributes included in this relation
are:

1. Stock Number (Stock-Num) - Same as stock
   number attribute in the Stock-Part relation.

2. Database I.D. (DB-Id) - This is the number
   which will be assigned to each database.

3. Site Number (Site-Num) - This is the number
   which will be assigned to each site within
   the SPLICE system.

4. Where (Where) - The location within the
   SPLICE site of a particular part, e.g.
   Charleston.

Note: The key is Stock-Num.
The third relation is Vendor-List. It is a list of all
vendors who service this LAN. The attributes are as follows:

1. Stock Number (Stock-Num) - Same as stock
   number given in prior relations.

2. Quality Vendor List (QVL) - This is the
   list of vendors that government agencies
   are allowed to procure parts from. This
   list is predefined and could be placed on
   secondary storage. When a list is needed
   it could be printed at that time.

3. Bid 1 (Bid1) - Gives the name of the vendor
   who bid on the part's contract. His bid
   is also included.

4. Bid 2 (Bid2) - Same as Bid1.

5. Bid 3 (Bid3) - Also the same as Bid1.

6. Bid Evaluation (Bid-Eval) - This attribute
   lists the vendor who won the bid.

7. Purchase Order (Purch-Ord) - This attribute
   gives the purchase order number. If this
   attribute is known, we can collect
   historical information or data concerning
   orders.

8. Lead Time (Lead-Time) - This attribute
   gives the amount of time which is needed
   between the time the order is placed and
   the shipment is delivered.

Note: The key is Purch-Ord.
The final relation is Location-Mf. This relation contains
information concerning the manufacturer of the part and
includes the following attributes:

1. Manufacturer Number (Mf-Num) - Same as
   previously described.

2. Manufacturer Location (Mf-Loc) - This
   attribute gives the city the manufacturer
   is located in.

3. Address (Address) - This attribute gives
   the mailing address of the manufacturer.

4. State (State) - This attribute gives the
   state the manufacturer is located in.

5. Zip (Zip) - This attribute gives the zip
   code in the manufacturer's address.

6. Phone Number (Phone-Num) - This attribute
   gives the phone number, which includes
   the area code, of the manufacturer's
   representative or salesperson.

7. Salesperson (Salesperson) - This attribute
   gives the name of the person who sold or
   was responsible for the sale in a procurement
   contract.

Note: The key is Mf-Num.
Thus, we have a relational database. Its data are not only
informative but also historical in nature. With purchase
order numbers, lead time and manufacturer information, it is
possible to determine when parts were delivered. The
purchase order number could be placed in the attribute
details along with its size, etc. The local network relations
could be used to help update parts within the SPLICE
system.
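
As an illustration only, the four relations described above could be declared as follows. This is a sketch using Python's standard sqlite3 module, which is an assumption made for the example and not the SPLICE implementation. The column types are guesses, attribute names are written with underscores, and the Where attribute is renamed location because WHERE is a reserved word in SQL.

```python
import sqlite3

# Hypothetical declaration of the four relations; the keys follow the
# "Note: The key is ..." lines, the column types are assumptions.
SCHEMA = """
CREATE TABLE stock_part (
    stock_num TEXT PRIMARY KEY,   -- Stock-Num
    mf_num    TEXT,               -- Mf-Num
    mf_name   TEXT,               -- Mf-Name
    part_name TEXT,               -- Part-Name
    quantity  INTEGER,            -- Quantity
    cost      REAL,               -- Cost
    details   TEXT,               -- Details
    reord_pt  INTEGER,            -- Reord-Pt
    wt        REAL,               -- Wt, pounds per part
    tot_wt    REAL                -- Tot-Wt
);
CREATE TABLE local_net (
    stock_num TEXT PRIMARY KEY,   -- Stock-Num
    db_id     INTEGER,            -- DB-Id
    site_num  INTEGER,            -- Site-Num
    location  TEXT                -- Where
);
CREATE TABLE vendor_list (
    purch_ord TEXT PRIMARY KEY,   -- Purch-Ord
    stock_num TEXT,               -- Stock-Num
    qvl       TEXT,               -- QVL
    bid1      TEXT,
    bid2      TEXT,
    bid3      TEXT,
    bid_eval  TEXT,               -- Bid-Eval
    lead_time INTEGER             -- Lead-Time
);
CREATE TABLE location_mf (
    mf_num      TEXT PRIMARY KEY, -- Mf-Num
    mf_loc      TEXT,             -- Mf-Loc
    address     TEXT,
    state       TEXT,
    zip         TEXT,
    phone_num   TEXT,             -- Phone-Num
    salesperson TEXT
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create an empty database holding the four relations."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Declaring each key as a primary key means that a second tuple with the same key value is rejected by the database itself rather than by application code.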

Relation Stock-Part has an attribute called weight.
This attribute is the weight of a part in pounds/part. When
used in conjunction with the total weight attribute, a quick
check can be made on the number (quantity) of available
parts with this stock number, i.e. Number of parts = (Total
Weight) / (Weight in pounds/part). If this number is not the
same as the attribute quantity, there is a possible theft or
missing part.
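
The check described above can be sketched in a few lines of Python. The function names are hypothetical, and rounding the quotient to the nearest whole part is an assumption made to absorb small errors in the recorded weights.

```python
def count_from_weight(total_weight: float, unit_weight: float) -> int:
    """Number of parts = (Total Weight) / (Weight in pounds/part)."""
    if unit_weight <= 0:
        raise ValueError("unit weight must be positive")
    # Assumption: round to the nearest whole part.
    return round(total_weight / unit_weight)


def quantity_matches(quantity: int, total_weight: float, unit_weight: float) -> bool:
    """Return False when the recorded quantity disagrees with the weight check."""
    return count_from_weight(total_weight, unit_weight) == quantity
```

For example, with Wt = 0.25 and Tot-Wt = 10.0 the computed count is 40, so a recorded Quantity of 40 passes while any other value would be flagged for investigation.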

A number of queries may be performed on this database.
The table-like structures of relational databases make
the results of operations performed on a database easy to
understand. The operations included in this thesis are:

1. Selections

2. Projections

3. Joins

A selection is an operation which asks for those
tuples in a certain relation that meet a certain criterion.
For example, using the Vendor-List relation, the Name and
Bid of the vendor who made the first Bid on a particular
purchase order could be selected.

A projection is an operation which takes a relation,
removes some of the attributes, and rearranges some of the
remaining attributes, if necessary. For example, using the
Local-Net relation, if the Database ID and Site Number were
needed to answer a query the information could be printed
giving the Site Number first followed by the Database ID as
opposed to the way it presently appears in the relational
table. Therefore data could be formatted in any manner the
user wanted.
Finally, there is the join operation. A join is an
operation which combines data of two or more relations. For
example, using the Stock-Part and Location-Mf relations,
the addresses of manufacturers with a particular stock
number could be found using the join operation in conjunction
with the selection operation. All of the above operations
can be used to answer a query for the user.
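
The three operations can be sketched over relations held as lists of tuples, here represented as Python dictionaries. This is an illustrative sketch only: the function names and the sample data are hypothetical, and the projection shown does not remove duplicate tuples as a strict relational projection would.

```python
def select(relation, predicate):
    """Selection: keep the tuples that meet the criterion."""
    return [t for t in relation if predicate(t)]


def project(relation, attributes):
    """Projection: keep only the named attributes, in the order given."""
    return [{a: t[a] for a in attributes} for t in relation]


def join(left, right, attribute):
    """Join: combine tuples of two relations that agree on one attribute."""
    return [{**l, **r} for l in left for r in right
            if l[attribute] == r[attribute]]


# Hypothetical sample tuples, not data from the thesis.
stock_part = [
    {"Stock-Num": "5305001234567", "Mf-Num": "M1", "Part-Name": "screw"},
    {"Stock-Num": "2040007654321", "Mf-Num": "M2", "Part-Name": "rudder"},
]
location_mf = [
    {"Mf-Num": "M1", "Address": "1 Main St"},
    {"Mf-Num": "M2", "Address": "2 Dock Rd"},
]

# Address of the manufacturer of one particular stock number.
wanted = select(join(stock_part, location_mf, "Mf-Num"),
                lambda t: t["Stock-Num"] == "2040007654321")
addresses = project(wanted, ["Address", "Mf-Num"])
```

The last three lines mirror the example in the text: a join of Stock-Part and Location-Mf on Mf-Num, a selection on the stock number, and a projection that also reorders the attributes.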

The relations in this database are rather long, but
they are designed that way in order to avoid having many
joins performed on relations. Each relation gives as much
information as is necessary for that particular relation. If
all of the information is not needed, selections can be made
on which attributes should be projected by the user. The
user would have no need of joining numerous relations
together in order to get the information he needs.
Figure 2.1  Relational Database Tables.
III. PROBLEMS IN THE LOCAL AREA NETWORK DATABASE MANAGEMENT MODULE

A. POSITIVE CHARACTERISTICS OF A DISTRIBUTED SYSTEM

The local area networks in the SPLICE system are made of
distributed systems. Some of the characteristics of a
distributed system are given below. First, minicomputers are
used in these systems. Secondly, distributed systems give
users more individualized control over processing. The scheduling
of jobs and quality of services can be determined by
the user himself. Finally, distributed systems can be more
readily tailored to organizational structures. Since no two
organizations are exactly alike, flexibility is important.

B. NEGATIVE CHARACTERISTICS OF A DISTRIBUTED SYSTEM

There are also negative characteristics of distributed
systems. First, the procedures required to implement distributed
systems are complex. Communications facilities must
be procured and computers must be connected. Data compatibility
also must be ensured. Thus, a number of problems may
occur in a distributed environment. These problems and
possible solutions will be discussed in the following
sections.

C. PROBLEM AREAS

1. Access Control and Security

One of the biggest problems in a distributed system,
as in any computer system, is access control and security.
If data are online and there are multiple users who may be
able to access that data, care must be taken as to how data
are altered. It is very important that the accuracy of data
is preserved. Data can be altered in two ways:

1. Accidentally, through typing errors or
   programming errors

2. Intentionally, through malicious misuse
   of a database.

To prevent incorrect data from being stored in a
database or being read by unauthorized personnel, there are
two areas of concern. According to Ullman, these concerns
are:

1. Integrity preservation, and

2. Security.

Integrity preservation is guarding against nonmalicious
errors. This can be done by writing a program in such a way
that it checks for conflicting records before any operation
(update, insert, etc.) is completed. Security, or access
control as it is sometimes called, is concerned with
restricting access of users. Only those persons with a "need
to know" should have access to particular data. Therefore,
modification and alteration would only be performed by
authorized personnel. These precautions, if taken, would
allow more control over data and thus preserve its accuracy
to a greater extent. As a bare minimum, all online files
should have access control using account numbers and accompanying
passwords. These passwords would restrict each user
to the data which he or she needs to get the job
done (for example, read-only passwords). This is needed just
for control of everyday usage of the database. A number of
additional precautions can be taken to further ensure that
data are less vulnerable. Encryption, the coding of data so
that it is unintelligible, is being used in the SPLICE
system as a further precaution for the protection of data.
This is not effective, however, unless the programs which
encrypt data are protected from would-be infiltrators.
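
A minimal sketch of the account number and password control described above is given here in Python. Everything in it is an assumption made for illustration: the class name AccessControl, the two operation names, and the choice of a salted PBKDF2 hash for storing passwords are not taken from the SPLICE documents.

```python
import hashlib
import hmac
import os

READ, WRITE = "read", "write"


class AccessControl:
    """Hypothetical table of accounts, passwords and permitted operations."""

    def __init__(self):
        # account number -> (salt, password hash, permitted operations)
        self._accounts = {}

    @staticmethod
    def _hash(password: str, salt: bytes) -> bytes:
        # Store a salted hash so the password itself is never kept on file.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def add_account(self, account: str, password: str, operations) -> None:
        salt = os.urandom(16)
        self._accounts[account] = (salt, self._hash(password, salt),
                                   frozenset(operations))

    def is_allowed(self, account: str, password: str, operation: str) -> bool:
        entry = self._accounts.get(account)
        if entry is None:
            return False
        salt, stored, operations = entry
        # Constant-time comparison of the stored and the supplied hash.
        if not hmac.compare_digest(stored, self._hash(password, salt)):
            return False
        return operation in operations
```

An account created with only the READ operation behaves like the read-only password mentioned in the text: queries are permitted, but any update is refused before it reaches the database.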
2. Concurrent Updates

A problem associated with all database systems is
that of concurrent updates. This is the problem which may
occur in any system which has more than one user updating
the same file at the same time. In a distributed system,
where there are concurrent executions, there is always the
chance of there being problems with livelock and deadlock.
Livelock is a situation where there may be one or more
processes waiting on a locked item. Using a first-come-first-served
strategy of locking items usually resolves this
problem. Afterwards, all locks are released. Deadlock is a
situation in which each member of a set of transactions is
waiting to lock an item already locked by another transaction
in the set. Since each transaction of the set is
waiting, it cannot unlock the items which the other
transactions need. Therefore all
of them wait indefinitely. Detection and prevention are two
ways of handling deadlocks.
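
Deadlock detection is commonly done by looking for a cycle in a "waits-for" graph. The sketch below is an assumed illustration in Python, not a method prescribed for SPLICE; the function name find_deadlock and the dictionary representation of the graph are hypothetical.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the waits-for graph, or None.

    waits_for maps each transaction to the transactions that hold
    locks it is waiting on.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    state = {}

    def visit(txn, path):
        state[txn] = GREY
        path.append(txn)
        for other in waits_for.get(txn, ()):
            colour = state.get(other, WHITE)
            if colour == GREY:
                # other is already on the current path: a cycle is closed.
                return path[path.index(other):]
            if colour == WHITE:
                cycle = visit(other, path)
                if cycle:
                    return cycle
        path.pop()
        state[txn] = BLACK
        return None

    for txn in list(waits_for):
        if state.get(txn, WHITE) == WHITE:
            cycle = visit(txn, [])
            if cycle:
                return cycle
    return None
```

When a cycle is reported, one of the transactions in it can be aborted and restarted so that the others are able to proceed.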

Locks should be placed on all items to be updated before
updates are done in order to solve the concurrent update
problem. Certain transactions prevent other transactions
from accessing a data item until that item is unlocked.
Therefore, other users cannot access that portion of the
database. This is particularly important in distributed
environments because of the various locations of data. If
the locking approach is used, the items to be locked should
be fairly large, maybe even entire relations. This would
reduce some of the costs associated with the locking
mechanism.
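
The first-come-first-served strategy and the use of large
lockable items (whole relations) can be sketched as a small
lock table. This is an illustrative sketch only. The class
name LockTable, the relation name STOCK and the transaction
names are hypothetical, and a real lock manager would also
need shared (read) locks and time-outs.

```python
from collections import deque
from typing import Deque, Dict, Optional


class LockTable:
    """Exclusive locks on whole relations, granted first-come-first-served."""

    def __init__(self) -> None:
        self.holder: Dict[str, str] = {}            # relation -> transaction
        self.waiting: Dict[str, Deque[str]] = {}    # relation -> FIFO queue

    def lock(self, txn: str, relation: str) -> bool:
        """Grant the lock if it is free; otherwise queue in arrival order."""
        if relation not in self.holder:
            self.holder[relation] = txn
            return True
        if self.holder[relation] != txn:
            queue = self.waiting.setdefault(relation, deque())
            if txn not in queue:
                queue.append(txn)
        return self.holder[relation] == txn

    def unlock(self, txn: str, relation: str) -> Optional[str]:
        """Release the lock and pass it to the longest-waiting transaction."""
        if self.holder.get(relation) != txn:
            raise ValueError(f"{txn} does not hold the lock on {relation}")
        queue = self.waiting.get(relation)
        if queue:
            self.holder[relation] = queue.popleft()
            return self.holder[relation]
        del self.holder[relation]
        return None


if __name__ == "__main__":
    table = LockTable()
    print(table.lock("T1", "STOCK"))     # granted
    print(table.lock("T2", "STOCK"))     # refused, T2 is queued
    print(table.lock("T3", "STOCK"))     # refused, T3 is queued behind T2
    print(table.unlock("T1", "STOCK"))   # lock passes to T2, not T3
```

Because waiting transactions are served strictly in arrival
order, no transaction can be passed over indefinitely, which
is the livelock remedy described earlier. Locking a whole
relation keeps the table small at the price of less
concurrency.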

When viewed over the entire SPLICE system, databases are
obviously geographically distributed. For the purposes of
maintaining adequate control over cataloging files and
maintaining the integrity of related files in the database
(synchronization of updating procedures), the database
functions are centralized within each LAN. Other than the
fact that some files will remain on the Burroughs hosts and
some files will migrate to an interactive DBM, there is no
reason to provide for the distribution of databases within a
LAN [Ref. 3].

3. System Crashes

One problem which needs consideration is how to handle or
prevent system crashes. When a computer fails, all or part
of a transaction may have been completed. There is no sure
way to tell exactly what has happened. For that reason, it
is essential that backup copies of files be made
periodically. For larger databases, copies can be made less
frequently than for smaller ones because of the amount of
time it takes to copy large databases. During recovery after
a failure, a determination has to be made as to which
transactions should be repeated. For this reason, a log or
journal is needed which will contain information concerning
all changes to the database since the last backup copy was
made. Recovery from failure is particularly a problem with
online systems. There may be no copies of the transactions.
Therefore, it may be very difficult to recreate the
transactions. The reprocessing may not repeat the exact
processing sequence.

Finally, consideration should be given to preventing and
handling system crashes and to determining whether a
transaction has been "committed" or not. Commitment is the
point at which a transaction is considered complete. When
transactions must be "redone" or "undone", a log or journal
which contains those which have committed will assist in
reconstructing the database. According to Ullman [Ref. 6],
one could define a two-phase commit policy which would
operate as follows:

1. A transaction cannot write into the database until it has
committed.

2. A transaction cannot commit until it has recorded all its
changes to items in the journal.

All unlocking is done after the committed transactions have
occurred. Uncommitted transactions cannot be input to the
database. If a crash occurs, destroyed data can be redone
using the backup copy of committed transactions. A message
could be sent to the user warning him about transactions
which have not been completed. Also, after a crash, locks
may still be in place on data items; a recovery routine will
have to remove these locks.
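
The two rules of the commit policy, and the redo step after a
crash, can be sketched as follows. This is a simplified
illustration and not Ullman's formulation. It assumes that the
journal survives the crash and that the journal was started
when the backup copy was taken. The class name
JournaledDatabase, its methods and the item names are
hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Change = Tuple[str, int]                     # (item name, new value)
Entry = Tuple[str, str, Optional[Change]]    # (transaction, kind, change)


class JournaledDatabase:
    def __init__(self) -> None:
        self.items: Dict[str, int] = {}      # the database proper
        self.journal: List[Entry] = []       # assumed to survive a crash
        self.pending: Dict[str, List[Change]] = {}

    def write(self, txn: str, item: str, value: int) -> None:
        """Record the change in the journal only; the database is untouched."""
        self.journal.append((txn, "change", (item, value)))
        self.pending.setdefault(txn, []).append((item, value))

    def commit(self, txn: str) -> None:
        """Commit (all changes are already journaled), then write the database."""
        self.journal.append((txn, "commit", None))
        for item, value in self.pending.pop(txn, []):
            self.items[item] = value

    def recover(self, backup: Dict[str, int]) -> List[str]:
        """Restore the backup and redo committed changes in journal order.

        Returns the transactions that never committed, so that their
        users can be warned.
        """
        committed = {txn for txn, kind, _ in self.journal if kind == "commit"}
        self.items = dict(backup)
        self.pending.clear()
        for txn, kind, change in self.journal:
            if kind == "change" and txn in committed and change is not None:
                item, value = change
                self.items[item] = value
        return sorted({txn for txn, _, _ in self.journal} - committed)


if __name__ == "__main__":
    db = JournaledDatabase()
    backup = dict(db.items)         # backup copy taken before any update
    db.write("T1", "part-A", 40)
    db.commit("T1")
    db.write("T2", "part-B", 7)     # T2 never commits
    db.items.clear()                # simulated crash destroys the database
    print(db.recover(backup))       # transactions to warn users about
    print(db.items)                 # only T1's change has been redone
```

Since an uncommitted transaction never writes into the
database, recovery only has to redo committed work; nothing
needs to be undone. Removing locks left behind by the crash is
not shown.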

4. Data Location

(1). Global or Local.

The characteristics of data, as related to a distributed
system, will now be examined. According to Kroenke
[Ref. 8], the characteristics of data in the distributed
environment can best be examined by considering two
questions that the designer of a system must answer. First,
"Where is the data to be located?" Second, "How will it be
updated?"

Data can be either local or global in a distributed system.
Local data is only needed at the local node [Ref. 8]. It is
processed by an application program of a local computer.
Since local data is not used by other computers, nodes never
request it from other nodes. Global data, on the other hand,
is needed by a program or programs that run on at least two
computers in the distributed system [Ref. 8]. According to
Kroenke, eighty percent of the data at a node tends to be
local and twenty percent is global. These are rough
guidelines. Also, Kroenke feels that local data should
remain local. Communication costs are too high to move local
data from the node on which it is used.

(2). Centralized or Partitioned.

It is not easy to know exactly where to put global data. The
first consideration is whether it should be centralized or
partitioned. If it is centralized, one computer in the
distributed network stores the data. A request is sent to
this computer when global data is needed. Partitioned data,
however, is spread over several computers. If data is
needed, its location has to first be determined and then the
data accessed.

The advantage of centralized data is that every node knows
the location of data. This is the approach which will be
used in the SPLICE system for the Database Management Module
in each LAN. This makes the system simple. The concurrent
update problem would only be handled by one processor.
However, if global data is needed, it is possible that a
performance bottleneck will develop when accessing data from
one computer. Partitioned data would not cause this kind of
problem.

Reliability also has to be considered. With centralized
data, if the one database computer fails, all nodes in the
local network would have to discontinue processing. In a
partitioned data system, when global data is needed and one
node fails, all other nodes in the network could continue
processing. Unfortunately, updates of partitioned data are
much harder to control [Ref. 8]. A centralized system can be
configured to continue operation, in the case of failure, by
using redundant hardware and software, combined with
mirrored disk operation and checkpointing. This will be the
way SPLICE will handle this problem.

(3). Replicated or Nonreplicated.

There should be a number of copies of the data stored in the
network. To replicate centralized global data, the entire
collection of global data is stored at several locations in
the network. When this is done, the nodes need only keep
lists of the computers having the replicated data. They do
not need directories that show the locations of particular
kinds of data; all of the data is located at each of the
nodes. Also, replicating global data eliminates the
bottleneck and reliability problems discussed above, but
introduces the problem of concurrent update control
[Ref. 8].

When data is partitioned or replicated, each node must have
access to a directory that gives the location or locations
of each type of data [Ref. 8]. Therefore, when a user tries
to access global data, the operating system or DBMS calls
upon the directory to find the location of that data.
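
A directory of this kind can be pictured as a table from each
type of data to the node or nodes that hold it. The sketch
below is illustrative only. The class name DataDirectory, the
data type names and the node names are assumptions, and how
the directory itself is stored and kept current is not
addressed.

```python
from typing import Dict, FrozenSet, List


class DataDirectory:
    """Maps each type of global data to the nodes that store a copy."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def register(self, data_type: str, node: str) -> None:
        nodes = self._locations.setdefault(data_type, [])
        if node not in nodes:
            nodes.append(node)

    def locate(self, data_type: str) -> List[str]:
        """Return every node holding the data; more than one means replicated."""
        if data_type not in self._locations:
            raise KeyError(f"no node stores {data_type!r}")
        return list(self._locations[data_type])

    def choose(self, data_type: str,
               down: FrozenSet[str] = frozenset()) -> str:
        """Pick the first listed node that is not known to be down."""
        for node in self.locate(data_type):
            if node not in down:
                return node
        raise RuntimeError(f"all copies of {data_type!r} are unavailable")


if __name__ == "__main__":
    directory = DataDirectory()
    directory.register("stock-status", "node-A")
    directory.register("stock-status", "node-C")    # a replicated copy
    directory.register("requisitions", "node-B")    # stored at one node only
    print(directory.locate("stock-status"))
    print(directory.choose("stock-status", down=frozenset({"node-A"})))
```

With replicated data the request can be redirected to another
copy when a node is down; with data that is partitioned but
not replicated there is only one location to try.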

Finally, Kroenke states that the greatest amount of
flexibility can be found with the replicated, partitioned
storage of data across a network. However, control is much
more difficult. More complexity is also added to the
operation of the network.

When updating data in a system that has replicated data,
there is an issue which should be considered. In a
centralized, nonreplicated data system, an application
program can lock records before they are used. The lock only
involves data in a single computer, whereas with replicated
data a lock would have to be placed on all computers with
the global data item. The problem with this is that if locks
are applied simultaneously, one user could possibly have the
record locked on one computer while another user has the
record locked on another computer. Neither user has complete
control because both locks are in place for two different
users. The system has to resolve this conflict. This process
can waste a lot of time.
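
The conflict can be reproduced with two copies of one record.
The sketch below first shows two users each obtaining the lock
on a different copy, and then one possible remedy: every user
locks the copies in the same fixed order and gives up what it
holds if it is refused. That remedy is an assumption chosen
for illustration; the source does not say how the conflict
would be resolved. All function names and the record key
NSN-123 are hypothetical.

```python
from typing import Dict, List

Replica = Dict[str, str]    # record -> user holding its lock on one computer


def try_lock(replica: Replica, record: str, user: str) -> bool:
    """Lock `record` on one computer unless another user already holds it."""
    return replica.setdefault(record, user) == user


def has_full_control(replicas: List[Replica], record: str, user: str) -> bool:
    """A user controls the record only if it holds the lock on every copy."""
    return all(r.get(record) == user for r in replicas)


def lock_in_fixed_order(replicas: List[Replica], record: str, user: str) -> bool:
    """Assumed remedy: lock copies in list order, backing off on refusal."""
    taken: List[Replica] = []
    for replica in replicas:
        if not try_lock(replica, record, user):
            for held in taken:          # release everything acquired so far
                del held[record]
            return False
        taken.append(replica)
    return True


if __name__ == "__main__":
    # Simultaneous locking of different copies: each user gets one of them.
    copies: List[Replica] = [{}, {}]
    try_lock(copies[0], "NSN-123", "A")
    try_lock(copies[1], "NSN-123", "B")
    print(has_full_control(copies, "NSN-123", "A"),
          has_full_control(copies, "NSN-123", "B"))

    # Fixed order: the user who reaches the first copy first obtains all.
    copies = [{}, {}]
    print(lock_in_fixed_order(copies, "NSN-123", "A"))
    print(lock_in_fixed_order(copies, "NSN-123", "B"))
```

In the first case neither user has complete control. In the
second, user B is refused at the first copy and holds nothing,
so user A is never blocked by a partial lock.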

Even nonreplicated, partitioned data can cause problems.
Suppose a transaction is to be applied to several different
records. There is no problem if this update is done on one
computer. If some of the data are on different computers,
however, there could be two users locking each other out.

These are concurrency problems. The resolution of these
problems is an active research area. Even though the
concurrent updating problem is of concern, the scope of this
thesis does not include the resolution of such problems.

IV. THE DATABASE MACHINE AS AN ALTERNATIVE DESIGN

A. GROWTH POTENTIAL OF SPLICE PROJECTS

The database computer (DBC) or the database machine (DBM),
as it is sometimes called, should be considered as one of
the hardware alternatives for implementing the Database
Management System (DBMS). By offloading DBMS functions from
the host application computers to the DBM, application
processing speed is increased and SPLICE application growth
can be more readily absorbed.

B. CONVENTIONAL COMPUTERS

1. Designed Using DBMS/Complex Software

Large databases of the future may need a DBM. Because the
database machine is a special purpose machine, which can
handle DBMS functions efficiently, large databases of the
future will more than likely be managed by database machines
as opposed to conventional database management software.

In the following paragraphs, DBMs will be examined. Actual
models of DBMs will be presented and the different
approaches to DBM architectures will be discussed.

2. Designed to Access Data by Physical Address

Conventional computer architectures and applications have
been designed to refer to physical addresses in order to
address data. With the increasing number of applications
which are centered around information storage and retrieval,
the conventional systems are unable to retrieve information
by content. This inability to handle content addressing has
led to interest in computer architectures which are more
efficient in information storage and retrieval applications.
One of the solutions to this problem is the database
computer. The database computer can be incorporated into a
system in one of four ways:

1. Back-end processor for a host

2. Intelligent peripheral control unit

3. Storage hierarchy

4. Network node

These approaches are independent of each other, which may
suggest that more than one approach could be included in a
system's architecture.
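
The difference between addressing data by physical address and
by content, discussed at the start of this section, can be
illustrated in a few lines. The sketch is a software analogy
only: the list index stands in for a physical address, and the
record layout and function names are assumptions.

```python
from typing import Dict, List

Record = Dict[str, object]

# A toy "disk": the list index plays the role of a physical address.
storage: List[Record] = [
    {"part": "bolt", "qty": 120},
    {"part": "gasket", "qty": 0},
    {"part": "valve", "qty": 15},
]


def read_by_address(address: int) -> Record:
    """Conventional access: the caller must already know where the data is."""
    return storage[address]


def read_by_content(**wanted: object) -> List[Record]:
    """Content addressing: return every record whose fields match the values."""
    return [record for record in storage
            if all(record.get(field) == value
                   for field, value in wanted.items())]


if __name__ == "__main__":
    print(read_by_address(2))        # the caller supplied the location
    print(read_by_content(qty=0))    # the caller supplied only a value
```

On a conventional computer the content search is a loop over
every record, executed by the host. The interest in database
computers comes from moving that search closer to the storage
device or into specialized hardware.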

C. DATABASE COMPUTER APPROACHES

1. Back-end Processor

The back-end processor is a general purpose computer which
works with the host in what can be thought of as a
master-slave configuration. High level access requests are
passed to the back-end processor by the host computer.
Access validation, management of storage, update lockout,
response formatting, and I/O operations are all performed by
the back-end processor. After the back-end processor is
finished, it passes the response back to the host.
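
The division of labor between host and back-end processor can
be sketched as two small classes. This is a hypothetical
illustration, not the interface of any actual back-end
machine: the class names, the request format and the
permission table are all assumptions, and update lockout is
omitted.

```python
from typing import Dict, List, Set


class BackEnd:
    """Stands in for the back-end processor that owns the stored data."""

    def __init__(self, relations: Dict[str, List[dict]],
                 permissions: Dict[str, Set[str]]) -> None:
        self._relations = relations        # storage managed by the back end
        self._permissions = permissions    # user -> relations it may read

    def handle(self, user: str, relation: str, **criteria: object) -> dict:
        # Access validation
        if relation not in self._permissions.get(user, set()):
            return {"status": "denied", "rows": []}
        # Storage access (the I/O step in a real machine)
        rows = [row for row in self._relations.get(relation, [])
                if all(row.get(k) == v for k, v in criteria.items())]
        # Response formatting
        return {"status": "ok", "rows": rows}


class Host:
    """The host only forwards high-level requests and receives the response."""

    def __init__(self, back_end: BackEnd) -> None:
        self._back_end = back_end

    def query(self, user: str, relation: str, **criteria: object) -> dict:
        return self._back_end.handle(user, relation, **criteria)


if __name__ == "__main__":
    back_end = BackEnd(
        {"STOCK": [{"part": "bolt", "qty": 120}, {"part": "valve", "qty": 15}]},
        {"clerk": {"STOCK"}},
    )
    host = Host(back_end)
    print(host.query("clerk", "STOCK", part="valve"))
    print(host.query("guest", "STOCK"))
```

The host never touches the stored records; it sees only the
formatted response, which is what allows the database work to
be moved off the host.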

2. Intelligent Peripheral

The intelligent peripheral control unit works in conjunction
with a mass storage device. Highly repetitive data accesses
are moved to a mass storage controller to avoid high
overhead on the host hardware and software. Functions like
device scheduling, head positioning, data recovery,
searching, sorting and error correction are performed by the
intelligent peripheral control unit. Functions like
sequential associative access and parallel read (on a disk)
can also be implemented. (Note: An associative computer
architecture allows data to be accessed directly by value
without physical addresses.)

3. Storage Hierarchy

The storage hierarchy is similar to the cache memory
approach in that both are concerned with the locality of
data. The locality of access is such that data already used,
or data near other data which has been used, is very likely
to be accessed in the near future [Ref. 9]. This
characteristic can be used to speed up the access of data in
a database. The database cache could be inserted in the
system between main storage and disk [Ref. 9]. If
implemented by a sequential access device, such as CCD or
bubble storage, access time would be less than 1 msec if
data is located in the cache. Otherwise, the request would
take longer. The fact that data is managed on the
least-recently-used basis ensures that most of the active
data resides in the fast access CCD or bubble storage.
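
Least-recently-used management of such a cache can be sketched
with Python's standard OrderedDict. The sketch shows only the
replacement policy; the class name LRUCache, the block
identifiers and the read_from_disk callback are assumptions,
and nothing here models CCD or bubble storage timing.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class LRUCache:
    """Keeps the most recently used blocks; evicts the least recently used."""

    def __init__(self, capacity: int,
                 read_from_disk: Callable[[Hashable], object]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._read_from_disk = read_from_disk
        self._blocks: "OrderedDict[Hashable, object]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id: Hashable) -> object:
        if block_id in self._blocks:
            self.hits += 1
            self._blocks.move_to_end(block_id)     # now most recently used
            return self._blocks[block_id]
        self.misses += 1
        data = self._read_from_disk(block_id)      # the slow path
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)       # evict least recently used
        return data


if __name__ == "__main__":
    cache = LRUCache(2, lambda block: f"block-{block}")
    for block in (1, 2, 1, 3, 2):
        cache.read(block)
    print(cache.hits, cache.misses)
```

In the usage example the second read of block 1 is a hit.
Reading block 3 then evicts block 2, because block 1 was used
more recently, so the final read of block 2 goes to disk
again.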

4. Network Node

According to reference 9, the "network node", yet another
approach to a database computer, is a general purpose
computer which communicates with several other nodes in the
system, most frequently using a data communications protocol
and serial channels, but possibly using I/O channels. The
benefit of this configuration is that several nodes (hosts)
can access a single shared database, thus avoiding
replication of the data. It is implemented on a general
purpose system, a host and back-end processor, or a host and
intelligent control unit.

D. DATABASE TECHNOLOGY

In the past decade, the Database Management System has
become more popular. There are many gains to be made through
the use of database technology. The Database Management
System relieves the application program of many tasks. Yet,
the Database Management System has had its drawbacks.
Typically, the software-laden database management system has
been large in size and complex in structure, which not only
overtaxes the host hardware, but also stresses the host
operating system [Ref. 10]. Larger and more sophisticated
applications began to demand more speed, capacity and
retrieval flexibility on general purpose computers.
Therefore, it is not surprising that a way to alleviate some
of the demands on general purpose computers has been sought.
As technology improved it was clear that a more efficient
form of back-end processor could be developed as a "database
machine", one specially designed to manipulate and access
data with more flexibility than conventional computers
running general purpose software [Ref. 11].

E. INITIAL RESEARCH OF DBMs


Research on the DBM concept started at Bell Labs in the
early seventies as work on a "back-end" DBMS (Database
Management System) using general purpose computers in a
dedicated environment [Ref. 11]. As a back-end machine, the
DBC (Database Computer) attempts to achieve high performance
and low cost [Ref. 10]. Originally, the five goals of the
DBC were:

1.  To    design    a   machine  with   the   capability   of 
handling    a    very   large    online   database   of 
10      bytes    or    greater     (The   DBM    is    usually 
not cost effective on a smaller database)

2. To build a database computer today (1979)

3. To have the DBC compete in a favorable
manner with the existing DBMS as far as
system throughput and cost of database
storage were concerned

4. To make security an integral part of the
DBC design

5. To provide a repertoire of very high-level
commands in order to sufficiently interface
with front-end computers and support
database management applications.

F. MODELS OF DBMs

1. IDM Family

a. IDM 500/1 and IDM 500/2

Britton Lee, Inc., a company in Los Gatos, California, has
developed a number of relational database machines. Among
those developed are the IDM 500/1 and IDM 500/2, relational
intelligent database machines, as well as the IDM System
300/600, a relational database management system.

The IDM 500/1 was the first high-performance Intelligent
Database Machine (IDM) on the market. It serves as an
auxiliary processor to one or more host computers and is
driven by a high-level query language which is resident in
the host. It handles the relational database tasks and
manages dedicated database disks. The IDM 500/1 has room for
expansion for medium to large database applications and is
programmed to optimally perform retrieving, updating,
sorting, etc. The IDM 500/2 is basically the same machine
but it is a high-end model with a custom-designed 10 MIPS
database accelerator. It handles transactions 2 to 10 times
faster.
A full complement of database management functions performed
by both members of the IDM family is listed below:

1. Basic Commands - Host formatted to IDM specifications,
which create and destroy databases, relations and indices to
data. Also appending, retrieving and replacing data are
handled in relations.

2. Integrated Data Dictionary - Data dictionary relations
are automatically maintained on information about data.

3. Transaction Management - Ensures that user-specified
transactions are fully completed or backed out in the event
of a failure.

4. Concurrency Control - Allows multiple users to safely
access the same database simultaneously.

5. Access Control - Protects data by using such features as
deny/permit access privileges and read/write locking of
shared data.

6. Audit Logging - Maintains checkpointed audit or
transaction logs for auditing, backup and recovery.

7. Backup and Recovery - Online dump facility supports
backup of disks, databases and transaction logs to disk,
tape and the host. The database is recovered via a load and
transactions are rolled forward.

8. Random Access Files

9. Stored Commands - Minimizes execution time. User-defined
stored commands are featured.

All these functions can be seen in figure 4.1. The
architecture is modularized and expandable. In figure 4.2
the data processor, main memory, disk controller and tape
controllers (optional) are shown. Host interfacing is
provided by both parallel IEEE-488 and serial RS232C host
interface modules (up to 8 hosts). The system also has extra
expansion slots for future growth. The IDM 500/2 has an
extra function, the Database Accelerator. It is
custom-designed and has an instruction set which optimizes
relational database processing.
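
The Basic Commands and Transaction Management functions listed
earlier (creating relations and indices, appending and
replacing data, and backing a transaction out on failure) can
be illustrated with any relational system. The sketch below
uses Python's built-in sqlite3 module purely as a stand-in. It
does not show the IDM's own command language or interface, and
the relation name stock and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create a relation and an index, then append two tuples.
conn.execute("CREATE TABLE stock (part TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("CREATE INDEX stock_qty ON stock (qty)")
conn.execute("INSERT INTO stock VALUES ('bolt', 120)")
conn.execute("INSERT INTO stock VALUES ('valve', 15)")
conn.commit()

# A transaction made of two updates; the second one fails.
try:
    with conn:  # commits on success, rolls back if the block raises
        conn.execute("UPDATE stock SET qty = qty - 20 WHERE part = 'bolt'")
        conn.execute("INSERT INTO stock VALUES ('valve', 99)")  # duplicate key
except sqlite3.IntegrityError:
    pass

# The first update was backed out together with the failed one.
rows = conn.execute("SELECT part, qty FROM stock ORDER BY part").fetchall()
print(rows)
```

The quantity of bolts is unchanged after the failure, which is
the "fully completed or backed out" behavior named in the
Transaction Management entry.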

b. IDM 300/600

The IDM System 300 and IDM System 600 are complete
relational database management systems for DEC VAX users
running VMS or UNIX, and for PDP-11 UNIX users [Ref. 12].
Both combine an Intelligent Database Machine (IDM), of
Britton Lee, with end-user software tools in the host for
database applications. Included in the software are: 1) data
entry facilities 2) an ad hoc online query language 3) a
report writer 4) program language interfaces for FORTRAN,
COBOL and C (programming language) 5) a full complement of
Database Administration utilities. The IDM System 300/600
architecture can be seen in figure 4.3.

The functions performed by the IDM System 300/600 are the
same as those listed above for the IDM 500/1 and IDM 500/2.
There is an additional function, however. That function is
Multiple Host Support. It can be expanded to allow several
hosts to access its databases. This access is provided by
the IDM System 300/600 UNIBUS Interface Packages. Figure 4.4
shows the system architecture. Figure 4.5 shows a summary of
the maximum IDM capacities.

2. iDBP 86/440

Another consideration is the Database Processor (iDBP
86/440) by Intel. It is a microprocessor-based relational
database management system. Functionally, it is a mass
storage controller for one or more hosts. Software and
specialized hardware are included in the database management
system design. It is positioned between the host(s) and a
set of dedicated disks. Figure 4.6 shows the system
architecture.

The iDBP provides a database management system "kernel"
which supports relational, hierarchical and network
databases. It also provides concurrency control, security,
integrity and recovery mechanisms for sharing of data. In
addition to traditional, record-oriented files, the iDBP
also manages unstructured files which may contain text,
graphics, digitized voice or digitized images [Ref. 13].
Even though the iDBP 86/440 is the first of a family of
database machines by Intel, it will continue to be enhanced
as the new VLSI component is integrated into it in the
future. Figure 4.7 gives the configurability of the iDBP and
its system capacities.

3. NOAH

The final database machine to be examined is called NOAH,
produced by HCR Systems Inc. NOAH is a relational database
machine which provides a modular architecture. Figure 4.8
gives NOAH's hardware configuration. Some of NOAH's software
modules reside on processors dedicated to database
management and query language functions while others reside
on the general purpose host. The functions performed by NOAH
are:

1. Query language (SQL/NOAH)

2. Integrated Data Dictionary

3. Security

4. Recovery

Figure 4.9 gives specifications and configuration
information.

These were, of course, only a few of the database machines
which have been designed and developed in the last

G. EXPECTED PERFORMANCE OF DBCs

1. Advantages of DBCs

A large number of database management functions are
implemented in hardware. Since this is the case, database
computers are expected to perform quite a bit better than
the computers which provide these functions in software.
Software security enforcement may also be absorbed in
hardware.

An existing database may be supported on the DBC by
converting the database to conform to the DBC representation
of data. This one time conversion is known as database
transformation [Ref. 10]. The DBC manufacturers claim that
no reprogramming of the database management application is
necessary, unless the user wishes to reformat his data. An
interface will translate the database management calls into
DBC commands. The interface requires a small amount of
software because DBC commands resemble high-level data
languages. This process is known as query translation. The
DBC interface package resides in the front-end computer. The
interface, together with the database computer, replaces a
full-scale software database management system and its
conventional disk storage [Ref. 10]. The application
program, however, is not replaced.
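
Query translation can be pictured as a thin layer that turns
each database management call made by the application into a
command for the database computer. The sketch below is purely
hypothetical: both the call layout (an operation, a relation
and optional conditions) and the command syntax are invented
for illustration and do not correspond to any actual DBC
command set.

```python
from typing import Dict, Optional

# Invented mapping from host-side operations to back-end command verbs.
_VERBS = {"get": "RETRIEVE", "store": "APPEND", "erase": "DELETE"}


def translate_call(op: str, relation: str,
                   where: Optional[Dict[str, object]] = None) -> str:
    """Turn a host-side DBMS call into a text command for the back end."""
    if op not in _VERBS:
        raise ValueError(f"unsupported operation: {op}")
    command = f"{_VERBS[op]} {relation}"
    if where:
        clauses = [f"{field} = {value!r}"
                   for field, value in sorted(where.items())]
        command += " WHERE " + " AND ".join(clauses)
    return command


if __name__ == "__main__":
    print(translate_call("get", "STOCK", {"part": "valve"}))
    print(translate_call("erase", "STOCK"))
```

The translator is small because the target commands are
already high-level; the application program keeps issuing the
same calls and is not rewritten.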

According to reference 10,

Today, software for mainframe computers is handling problems
such as recovery from failure, concurrency control, and
integrity validation. If the DBC handles such problems, it
would relieve the mainframe system of many of the database
software functions.

2. Disadvantages of DBCs

In spite of all the potential benefits provided by database
computers, there are some disadvantages which must be
pointed out. These disadvantages are the following:

1. DBMs increase system complexity

2. DBMs make load balancing difficult

3. DBMs will create training and conversion requirements for
users

When a system is designed, complexity, functionality and
cost must be considered. A decision has to be made as to
which of these issues is most important to an organization.
The back-end processor and its associated software add cost
and complexity to the total computer system [Ref. 14]. The
decision which must be made by the organization, however, is
whether these two added dimensions will pay for themselves
in the future. Also to be considered is the fact that two
hardware and software systems will have to be maintained.

Another consideration is the fact that with conventional
hosts, it is possible to balance the load between computers.
Once a DBM has been acquired this is not possible, since the
DBM is dedicated to one task, database management. However,
it must be mentioned that one of the main reasons for
acquiring a DBC is to offload DBMS functions from the host
to the DBC.

Finally, if an organization has never used a database
computer before, there will be a considerable amount of time
which will be needed for training and system conversion. The
organization must consider this and incorporate its effect
into the time it will take for the system to become
operational. Also, the organization must anticipate
resistance to the new technology from experienced people.
These are a few of the possible disadvantages an
organization could encounter as a result of changing its
operation to include the database computer concept.

Figure 4.1 Database Management Functions of the IDM 500.

Figure 4.4 The IDM System 300/600 Architecture with Interfaces.

SPECIFICATION                  IDM 200              IDM 500

Base Configuration             5 board set          5 board set
                               in 7 slot chassis    in 16 slot chassis
Expandable to:
  IDM Memory                   1 Mbyte              6 Mbytes
  SMD Storage                  4 SMD disks          16 SMD disks
  Tape Controller              8 transports         8 transports
  I/O Controller               24 devices           64 devices
    RS-232 serial and/or
    IEEE-488 parallel
  Database Accelerator         No                   Yes - Optional

Relational DBMS Capacity
  Number of databases          50                   50
  Relations per database       32,000               32,000
  Attributes per relation      250                  250
  Tuples per relation          2 billion            2 billion
  Tuple Width                  2,000 bytes          2,000 bytes
  Indices per relation         255                  255
  Attributes per index         15                   15
  Index type                   B-tree               B-tree

Number of Users                128                  4,096

Figure 4.5 Summary of the Maximum IDM Capacities

Figure 4.6 The iDBP System Architecture.

Configurability

System Feature               Options Available

Memory size                  640K bytes of RAM
                             1024K bytes of RAM

Host interfaces:             One to sixteen serial links (RS232)
(may be intermixed           One to four parallel links (IEEE 488)
per system)                  One or two Ethernet links

Mass storage interfaces:     One to sixteen SMD-compatible or
                               Winchester disk drives
                             One to four SMD or Winchester disk
                               controllers
                             One start/stop tape drive

System capacity

Maximum number of files                    32,767
Maximum file size                          268 Mbytes
Maximum number of databases                255
Maximum number of files/database           255
Maximum number of items/record             127
Maximum number of concurrent sessions      254
Maximum structured record size             8,192 bytes
  (equals maximum page size)

Figure 4.7 Configurability of the iDBP and its System Capacities

Figure 4.8 NOAH's Hardware Configuration.
Preliminary Specifications

Database Type:  Relational
Maximum data capacity:  32 billion bytes
Typical processing rate:
    2-5 transactions per second
    10-25 with Database Accelerator Option
Maximum number of databases per IDM:  50
Maximum number of relations (files) per database:  32,000
Maximum number of domains (fields) per relation:  250
Maximum number of tuples (records) per relation:  2 billion
Maximum tuple width:  2000 bytes
Data types:
    1, 2, 4 byte integers
    1-255 byte variable length character fields
    1-31 digit packed decimal
    4, 8 byte floating point
Maximum number of clustering indices per relation:  1
Maximum number of non-clustering indices per relation:  255
Maximum number of domains (keys) per index:  15
Index type:  B tree

Configuration Information

Base configuration:

•  Query Processor
       8 Channel Processor Boards
       Database Processor Software Loader
       NQL Noah Query Language
       Database Management Utilities
       Interface Software for Supported Host(s)
       16-slot Chassis, Power Supply and Bottom Plane

•  Database Processor
       Memory Timing and Control board
       Memory Storage board (256K bytes)
       Disc Controller (supports up to 4 Storage Module Drives)
       Serial or Parallel I/O Channel Board
       16-slot Chassis, Power Supply and Bottom Plane

Options

    Additional Query Channels (Supports up to 12)
    X2 Query Channel Upgrade
    Report Writer, Query Channel (Available 9/83)
    488 IEEE Host/Noah Upgrade
    488 IEEE Internal Noah Upgrade
    Tape Controller and Tape Drive (Supports up to 8 tape drives)
    Database Accelerator (Typically improves performance by a factor of 10)
    Memory (256K byte Array Boards) - up to 3 Megabytes of Storage

Physical Size:      52" H X 30" D X 24" W

Noah Enclosure:     19" W X 22" H X 2'" D

Weight:             Noah Enclosure   150 Lbs. Max.   120 Lbs. Avg.
                    Cabinet bay      100 Lbs.
                    IDM              170 Lbs. Max.   150 Lbs. Avg.

Electrical Spec.:   900 W Max.  120 V 60 Hz AC ± 10%


Figure 4.9    Specifications and Configuration Information.

V.    CONCLUSIONS

Distributed processing and computer networks are
enabling computers to use programs and data stored in
computers at different locations. The SPLICE project is one
of those projects in which these advances will be
incorporated. Growth in communications, minicomputers and
microcomputers is making the use of distributed processing
possible. Many of the jobs which used to be done on large,
heavily shared computers can now be done on stand-alone
minicomputers or microcomputers [Ref. 15]. The advantages
of distributed information have received increasing
attention, and the recognition of these advantages has provided
impetus to work on distributed systems. However, the
complexities of such systems must be investigated and
resolved if these systems are to work effectively.

This thesis covered only a small portion of SPLICE, the
Database Management Module. A conceptual design of the
database for SPLICE was given. Since the database will
contain data on supply parts, the attributes
given in each relation were carefully chosen to reflect the
information needed in an inventory system. The relations
which were designed are long. Shorter relations can be
joined to produce the same attributes. Joins, however, can
be time consuming. Therefore, the longer relations, which
take less time to provide the attributes needed to
answer a query, were utilized.
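
The trade-off between long relations and joined shorter relations can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than any system considered for SPLICE, and the relation names, attributes and stock numbers (part, stock, part_long, P-001) are hypothetical, not taken from the conceptual design. It shows only that a join of two short relations and a single long relation can return the same attributes; it does not measure the time difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical normalized design: two short relations keyed by stock number.
cur.execute("CREATE TABLE part (stock_no TEXT PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE stock (stock_no TEXT PRIMARY KEY, on_hand INTEGER, location TEXT)"
)
# Hypothetical long relation holding the same attributes in one place.
cur.execute(
    "CREATE TABLE part_long "
    "(stock_no TEXT PRIMARY KEY, name TEXT, on_hand INTEGER, location TEXT)"
)

rows = [("P-001", "bolt", 40, "A1"), ("P-002", "gasket", 7, "B3")]
for stock_no, name, on_hand, location in rows:
    cur.execute("INSERT INTO part VALUES (?, ?)", (stock_no, name))
    cur.execute("INSERT INTO stock VALUES (?, ?, ?)", (stock_no, on_hand, location))
    cur.execute(
        "INSERT INTO part_long VALUES (?, ?, ?, ?)",
        (stock_no, name, on_hand, location),
    )

# Query 1: the short relations must be joined to assemble the answer.
joined = cur.execute(
    "SELECT p.stock_no, p.name, s.on_hand, s.location "
    "FROM part p JOIN stock s ON p.stock_no = s.stock_no "
    "ORDER BY p.stock_no"
).fetchall()

# Query 2: the long relation answers the same question without a join.
direct = cur.execute(
    "SELECT stock_no, name, on_hand, location FROM part_long ORDER BY stock_no"
).fetchall()

print(joined == direct)
```

The long relation avoids the join at query time, at the cost of storing all attributes in every tuple.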

Among the problems of a distributed system are access
control and security. Encryption devices, user
accounts and passwords are the minimum security features
which must be incorporated into SPLICE.

Concurrent updates in a distributed environment can
cause a problem. However, deadlocks and livelocks can
possibly be handled by using locking strategies. Besides the
concurrent update problem, recovery from crashes must be
considered. When a system fails, the number of transactions
which were completed is unknown. There must be some type of
logs or journals which will help to determine not only which
transactions were completed, but also the source of the
failure. All of this has to be considered in light of the
fact that other systems in a distributed network must
continue to function regardless of the fact that one
computer has failed.
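
As a minimal sketch of how a journal could help determine which transactions were completed, the following Python fragment scans a list of journal entries after a simulated failure. The record format, a pair of transaction identifier and "begin" or "commit" event, and the function name recover are assumptions made for illustration; they are not part of the SPLICE design.

```python
def recover(journal):
    """Split journaled transactions into completed and incomplete ones.

    Each journal entry is assumed to be a (transaction_id, event) pair,
    where event is either "begin" or "commit".
    """
    begun = []          # transaction ids in the order they started
    committed = set()   # transaction ids with a commit record
    for txn_id, event in journal:
        if event == "begin":
            begun.append(txn_id)
        elif event == "commit":
            committed.add(txn_id)
    done = [t for t in begun if t in committed]
    incomplete = [t for t in begun if t not in committed]
    return done, incomplete


# Hypothetical journal contents at the moment of a crash.
journal = [("T1", "begin"), ("T2", "begin"), ("T1", "commit"), ("T3", "begin")]
done, incomplete = recover(journal)
print(done, incomplete)
```

Transactions without a commit record would have to be undone or rerun after the failed computer restarts. Identifying the source of the failure would require additional information in the journal, which this sketch does not model.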

Locking data is another problem within a local area
network. Whether data is local or global, centralized or
partitioned, will determine the magnitude of the problem.
Accessing data, as well as updating it, can be a big problem,
especially if locks must be placed on the data.
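
One common locking strategy for avoiding deadlock is to acquire locks in a fixed global order. The sketch below illustrates that idea within a single Python process using threads. The item names, quantities and the function move_stock are hypothetical, and a distributed implementation would need a lock manager that this example does not model.

```python
import threading

# Hypothetical data items, each guarded by its own lock.
locks = {"item_a": threading.Lock(), "item_b": threading.Lock()}
on_hand = {"item_a": 10, "item_b": 5}


def move_stock(src, dst, qty):
    """Move qty units from src to dst (assumes src and dst differ)."""
    # Always lock in sorted name order, whatever the direction of the
    # move, so two transactions can never each hold one lock while
    # waiting for the other.
    first, second = sorted((src, dst))
    with locks[first]:
        with locks[second]:
            on_hand[src] -= qty
            on_hand[dst] += qty


# Fifty moves in each direction run concurrently.
threads = [
    threading.Thread(target=move_stock, args=("item_a", "item_b", 1))
    for _ in range(50)
]
threads += [
    threading.Thread(target=move_stock, args=("item_b", "item_a", 1))
    for _ in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(on_hand)
```

Because the moves in the two directions cancel, the quantities on hand end where they started, and the fixed lock order lets all threads finish.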

Finally, alternative hardware considerations for SPLICE
were discussed. The database computers were proposed as an
alternative to conventional computers. The different
approaches to database computers were given (back-end
processor, intelligent peripheral, storage hierarchy,
network node) along with a brief description of each. Also,
several models of database computers were presented with
their functions and architectural configurations. From the
information presented on these database computers, all of
which were relational, we found that they have features
which could be very useful for SPLICE. Their ability to
access the content of a data item directly and
to offload database management from the host
makes them very attractive alternatives. Some of the database
computers are even able to interface with existing
databases. We must, however, take into account the disadvantages
of database computers. All things considered, database
computers seem to be a very viable alternative for SPLICE.